579 research outputs found

    Federated Ensemble Regression Using Classification

    Ensemble learning has been shown to significantly improve predictive accuracy in a variety of machine learning problems. For a given predictive task, the goal of ensemble learning is to improve predictive accuracy by combining the predictive power of multiple models. In this paper, we present an ensemble learning algorithm for regression problems which leverages the distribution of the samples in a learning set to achieve improved performance. We apply the proposed algorithm to a problem in precision medicine where the goal is to predict drug perturbation effects on genes in cancer cell lines. The proposed approach significantly outperforms the base case

    COSMIC 2005

    The Catalogue Of Somatic Mutations In Cancer (COSMIC) database and web site was developed to preserve somatic mutation data and share it with the community. Over the past 25 years, approximately 350 cancer genes have been identified, of which 311 are somatically mutated. COSMIC has been expanded and now holds data previously reported in the scientific literature for 28 known cancer genes. In addition, there is data from the systematic sequencing of 518 protein kinase genes. The total gene count in COSMIC stands at 538: 25 have a mutation frequency above 5% in one or more tumour types, 333 have no mutations, and 180 are rarely mutated, with frequencies <5% in any tumour set. The COSMIC web site has been expanded to give more views and summaries of the data and provide faster query routes and downloads. In addition, there is a new section describing mutations found through a screen of known cancer genes in 728 cancer cell lines including the NCI-60 set of cancer cell lines

    Complete loss of TP53 and RB1 is associated with complex genome and low immune infiltrate in pleomorphic rhabdomyosarcoma

    Rhabdomyosarcoma accounts for roughly 1% of adult sarcomas, with pleomorphic rhabdomyosarcoma (PRMS) as the most common subtype. Survival outcomes remain poor for patients with PRMS, and little is known about the molecular drivers of this disease. To better characterize PRMS, we performed a broad array of genomic and immunostaining analyses on 25 patient samples. In terms of gene expression and methylation, PRMS clustered more closely with other complex karyotype sarcomas than with pediatric alveolar and embryonal rhabdomyosarcoma. Immune infiltrate levels in PRMS were among the highest observed in multiple sarcoma types and contrasted with low levels in other rhabdomyosarcoma subtypes. Lower immune infiltrate was associated with complete loss of both TP53 and RB1. This comprehensive characterization of the genetic, epigenetic, and immune landscape of PRMS provides a roadmap for improved prognostications and therapeutic exploration

    MethCancerDB – aberrant DNA methylation in human cancer

    Early detection, classification and prognosis of human cancers by analysis of CpG methylation carry huge diagnostic potential. MethCancerDB collects and annotates genes and sequences from the abundance of published methylation studies and interlinks them to all methylation-relevant bioinformatical resources. MethCancerDB starts with 4720 entries from 348 sources and is freely accessible at http://www.methcancerdb.net

    Deriving a mutation index of carcinogenicity using protein structure and protein interfaces

    With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/

    JISTIC: Identification of Significant Targets in Cancer

    Background: Cancer is caused through a multistep process, in which a succession of genetic changes, each conferring a competitive advantage for growth and proliferation, leads to the progressive conversion of normal human cells into malignant cancer cells. Interrogation of cancer genomes holds the promise of understanding this process, thus revolutionizing cancer research and treatment. As datasets measuring copy number aberrations in tumors accumulate, a major challenge has become to distinguish between those mutations that drive the cancer versus those passenger mutations that have no effect. Results: We present JISTIC, a tool for analyzing datasets of genome-wide copy number variation to identify driver aberrations in cancer. JISTIC is an improvement over the widely used GISTIC algorithm. We compared the performance of JISTIC versus GISTIC on a dataset of glioblastoma copy number variation; JISTIC finds 173 significant regions, whereas GISTIC only finds 103 significant regions. Importantly, the additional regions detected by JISTIC are enriched for oncogenes and genes involved in cell-cycle and proliferation. Conclusions: JISTIC is an easy-to-install, platform-independent implementation of GISTIC that outperforms the original algorithm, detecting more relevant candidate genes and regions. The software and documentation are freely available and can be found at: http://www.c2b2.columbia.edu/danapeerlab/html/software.html

    STK295900, a Dual Inhibitor of Topoisomerase 1 and 2, Induces G2 Arrest in the Absence of DNA Damage

    STK295900, a small synthetic molecule belonging to a class of symmetric bibenzimidazoles, exhibits antiproliferative activity against various human cancer cell lines from different origins. Examining the effect of STK295900 in HeLa cells indicates that it induces G2 phase arrest without invoking DNA damage. Further analysis shows that STK295900 inhibits DNA relaxation that is mediated by topoisomerase 1 (Top 1) and topoisomerase 2 (Top 2) in vitro. In addition, STK295900 also exhibits a protective effect against DNA damage induced by camptothecin. However, STK295900 does not affect etoposide-induced DNA damage. Moreover, STK295900 preferentially exerts a cytotoxic effect on cancer cell lines, while camptothecin, etoposide, and Hoechst 33342 affected both cancer and normal cells. Therefore, STK295900 has the potential to be developed as an anticancer chemotherapeutic agent. © 2013 Kim et al

    Cancer somatic mutations cluster in a subset of regulatory sites predicted from the ENCODE data

    Background: Transcriptional regulation of gene expression is essential for cellular differentiation and function, and defects in the process are associated with cancer. The ENCODE project has mapped potential regulatory sites across the complete genome in many cell types, and these regions have been shown to harbour many of the somatic mutations that occur in cancer cells, suggesting that their effects may drive cancer initiation and development. The ENCODE data suggests a very large number of regulatory sites, and methods are needed to identify those that are most relevant and to connect them to the genes that they control. Methods: Predictive models of gene expression were developed by integrating the ENCODE data for regulation, including transcription factor binding and DNase1 hypersensitivity, with RNA-seq data for gene expression. A penalized regression method was used to identify the most predictive potential regulatory sites for each transcript. Known cancer somatic mutations from the COSMIC database were mapped to potential regulatory sites, and we examined differences in the mapping frequencies associated with sites chosen in regulatory models and other (rejected) sites. The effects of potential confounders, for example replication timing, were considered. Results: Cancer somatic mutations preferentially occupy those regulatory regions chosen in our models as most predictive of gene expression. Conclusion: Our methods have identified a significantly reduced set of regulatory sites that are enriched in cancer somatic mutations and are more predictive of gene expression. This has significance for the mechanistic interpretation of cancer mutations, and the understanding of genetic regulation

    Low frequency of somatic mutations in the FH/multiple cutaneous leiomyomatosis gene in sporadic leiomyosarcomas and uterine leiomyomas

    Germline mutations in the fumarate hydratase gene at 1q43 predispose to dominantly inherited skin and uterine leiomyomata and leiomyosarcomas. The enzyme, which is a component of the tricarboxylic acid cycle, acts as a tumour suppressor. To evaluate fumarate hydratase in respective sporadic tumours, we analysed a series of 26 leiomyosarcomas and 129 uterine leiomyomas (from 21 patients) for somatic mutations in fumarate hydratase and allelic imbalance around 1q43. None of the 26 leiomyosarcomas harboured somatic mutations in fumarate hydratase. Fifty per cent of leiomyosarcomas tested showed evidence of allelic imbalance at 1q, but this was not confined to the vicinity of fumarate hydratase. Only 5% (seven out of 129) of the leiomyomas showed allelic imbalance at 1q42-q43 and no somatic mutations in fumarate hydratase were observed. Our findings indicate that mutations in fumarate hydratase do not play a major role in the development of sporadic leiomyosarcomas or uterine leiomyomas

    Overexpression of Full-Length ETV1 Transcripts in Clinical Prostate Cancer Due to Gene Translocation

    ETV1 is overexpressed in a subset of clinical prostate cancers as a fusion transcript with many different partners. However, ETV1 can also be overexpressed as a full-length transcript. Full-length ETV1 protein functions differently from truncated ETV1 produced by fusion genes. In this study we describe the genetic background of full-length ETV1 overexpression and the biological properties of different full-length ETV1 isoforms in prostate cancer. Break-apart FISH showed a genomic rearrangement of the gene in five out of six patient samples with overexpression of full-length ETV1, indicating frequent translocation. We were able to study the rearrangements in more detail in two tumors. In the first tumor, 5′-RACE on cDNA showed linkage of the complete ETV1 transcript to the first exon of a prostate-specific two-exon ncRNA gene that maps on chromosome 14 (EST14). This resulted in the expression of both full-length ETV1 transcripts and EST14-ETV1 fusion transcripts. In chromosome spreads of a xenograft derived from the second prostate cancer we observed a complex ETV1 translocation involving a chromosome 7 fragment that harbors ETV1 and fragments of chromosomes 4 and 10. Further studies revealed the overexpression of several different full-length transcripts, giving rise to four protein isoforms with different N-terminal regions. Even the shortest isoform synthesized by full-length ETV1 stimulated in vitro anchorage-independent growth of PNT2C2 prostate cells. This contrasts with the lack of activity of even shorter N-truncated ETV1 produced by fusion transcripts. Our findings that in clinical prostate cancer overexpression of full-length ETV1 is due to genomic rearrangements involving different chromosomes and the identification of a shortened biologically active ETV1 isoform are highly relevant for understanding the mechanism of ETV1 function in prostate cancer